Ecological Informatics — Latest Matching Preprints

1

Integration of Deep-Learning and Species Distribution Models for Classification of Animal Species of the Brazilian Fauna

Oliveira, M. B.; Bernardino, H. S.; Vieira, A. B.; Barroso, A. A.; Augusto, D. A.

2026-05-08 ecology 10.64898/2026.05.06.723365 medRxiv

Top 0.1%

23.3%

The automated classification of animals from photos is important in ecology and conservation biology for organizing and understanding the immense diversity of species, as well as facilitating effective conservation and management practices. It is equally important for disease surveillance systems, allowing prompt detection of anomalies in species distributions and boosting citizen-scientist platforms by making user-reported data more accurate and convenient. Image classification uses photos and can also rely on the geographical locations of animals to improve performance. While image classification models struggle with low-quality images, unbalanced datasets, and classes with few images, species distribution models (SDMs) struggle to separate species that coexist in a given region. We propose here strategies for combining image classification models based on deep neural networks with species distribution models using genetic algorithms. The proposal is applied to a real-world dataset comprising fifteen classes of animals from the Brazilian fauna obtained from Fiocruz's citizen-scientist Wildlife Health Information System (SISS-Geo). The SISS-Geo photos portray the reality of animals in their environments, with varying quality, and pose numerous difficulties for classification. Experimental results demonstrate that the proposed integration consistently outperforms standalone models. While individual SDMs achieve Top-1 accuracies of 27.79% (MaxEnt) and 31.76% (Bioclim), and CNN-based classifiers reach 58.17% with ResNet50 and 64.13% with ResNet-152, the hybrid strategies yield substantial improvements. The genetic algorithm-based integration with a single global weight achieves up to 67.96% Top-1 accuracy, whereas the class-specific integration using fifteen parameters attains the best overall performance, reaching 69.03%.
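
The abstract does not state the fusion formula, so the following is only a minimal sketch of one plausible reading: a convex combination of CNN and SDM scores, with either one global weight or one weight per class. The function names, the toy scores, and the hand-picked weight are all illustrative assumptions; in the paper the weights are found by a genetic algorithm, which is not reproduced here.

```python
from typing import List, Sequence


def fuse_scores(cnn: Sequence[float], sdm: Sequence[float],
                weights: Sequence[float]) -> List[float]:
    """Blend per-class scores as w*cnn + (1-w)*sdm, then renormalise.

    `weights` holds one value per class; repeating the same value for
    every class gives the single-global-weight variant (an assumed form).
    """
    if not (len(cnn) == len(sdm) == len(weights)):
        raise ValueError("cnn, sdm and weights must have the same length")
    blended = [w * c + (1.0 - w) * s for c, s, w in zip(cnn, sdm, weights)]
    total = sum(blended)
    return [b / total for b in blended] if total > 0 else blended


def top1(scores: Sequence[float]) -> int:
    """Index of the highest-scoring class."""
    return max(range(len(scores)), key=lambda i: scores[i])


# Toy three-class example (hypothetical numbers, not from the paper).
cnn_scores = [0.50, 0.45, 0.05]   # image model slightly prefers class 0
sdm_scores = [0.10, 0.80, 0.10]   # location strongly suggests class 1
global_weights = [0.6] * 3        # one shared weight, chosen by hand here

fused = fuse_scores(cnn_scores, sdm_scores, global_weights)
print(top1(cnn_scores), top1(fused))
```

In this toy case the location evidence flips the prediction from class 0 to class 1. A class-specific variant would simply pass fifteen different weights instead of one repeated value.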

2

Tuning into the city soundscape: Optimizing Convolutional Neural Networks for avian acoustic identification in the neotropics and evaluating their performance against established monitoring approaches.

Ardila-Villamizar, M.; De Clippele, L. H.; Dominoni, D. M.

2026-05-13 ecology 10.64898/2026.05.10.724049 medRxiv

Top 0.1%

16.8%

Convolutional Neural Networks (CNNs) have become increasingly prominent in biodiversity monitoring due to their strong performance in accurately detecting species from sound recordings, overcoming some limitations of traditional methods such as point-counts. Yet, their use in urban ecosystems remains limited, highlighting the need for frameworks that identify modelling strategies to optimize their performance in these complex soundscapes. Here, we evaluated how preprocessing and labelling strategies, detection thresholds, sample size, and architecture affect the performance of CNNs for bird identification in urban tropical ecosystems. We also assessed their potential by comparing CNN-derived biodiversity estimates with those from point-counts and acoustic indices. For this, we used one week of recordings collected along urbanization gradients in five Colombian Andes cities to develop 11 multiclass CNN models varying in spectral representation, labelling strategies, training data source and backbone architecture. The best-performing model, evaluated with F1-scores, combined Log-Mel spectrograms, multispecies labels, ecosystem-specific recordings, a probability threshold of 0.3 and a ConvNeXt backbone, with performance generally improving with sample size. Although CNNs and point-counts detected partially distinct assemblages, CNN-derived species richness was comparable to that estimated from point-counts. In addition, the Normalized Difference Soundscape Index (NDSI) was positively associated with richness, suggesting its potential as a biodiversity proxy in tropical urban soundscapes. Overall, by identifying effective modelling designs and monitoring strategies, our study advances the development of robust biodiversity assessment frameworks in urbanized ecosystems in the Neotropics whilst also providing methodological guidance for future research and practical insights for wildlife monitoring and conservation.
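
To illustrate how a probability threshold interacts with an F1-score in a multispecies setting, here is a small sketch. It assumes the model emits one independent probability per species for each clip; the species labels and probabilities are hypothetical, and the per-clip F1 shown here is a simplification of whatever averaging scheme the study used.

```python
from typing import Dict, Set


def predict_species(probs: Dict[str, float], threshold: float = 0.3) -> Set[str]:
    """Keep every species whose predicted probability reaches the threshold."""
    return {species for species, p in probs.items() if p >= threshold}


def f1_score(predicted: Set[str], truth: Set[str]) -> float:
    """Harmonic mean of precision and recall for one clip."""
    true_positives = len(predicted & truth)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(truth)
    return 2 * precision * recall / (precision + recall)


# Hypothetical clip: species A and C are actually present.
clip_probs = {"species_A": 0.82, "species_B": 0.35, "species_C": 0.12}
clip_truth = {"species_A", "species_C"}

for threshold in (0.3, 0.5):
    predicted = predict_species(clip_probs, threshold)
    print(threshold, sorted(predicted), round(f1_score(predicted, clip_truth), 3))
```

Lower thresholds admit more species per clip, trading precision for recall, which is why the threshold is tuned alongside the other modelling choices.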

3

The morphotype approach to classification of aerial animals in radar data

Werber, Y.

2026-03-26 ecology 10.64898/2026.03.24.713935 medRxiv

Top 0.1%

14.7%

Radar aeroecology is dedicated to making ecological inference about aerial wildlife from radar-derived information. While producing unique, large-scale datasets describing biological activity in the sky, radar methodologies are largely incapable of relating these to specific species and are thus taxonomically limited. I describe a computational method to increase taxonomic resolution in vertical-looking radar data by dividing detected organisms into morphology- and movement-based aerial morphotypes. Using the Birdscan MR1 radar target classifier, wing flapping frequency calculation and target size estimation, I demonstrate a nearly 8-fold increase in classification resolution of bird radar data from the Hula Valley Research station, Israel. Furthermore, by relating each species in the region's species pool to its relevant morphotype, I show that most of these newly separated classes are related to small numbers of species (1-10), providing realistic opportunities to bridge the taxonomy gap in radar data. By using the morphotype approach, radar aeroecologists can start observing and discussing the concept of "Aerodiversity", analogous to the widely used biodiversity, a fundamental measure in ecology and conservation sciences. By analytically addressing taxonomy in radar aeroecology, practitioners will increase the impact and dissemination of their work and contribute to a better, more complete understanding of the aerial habitat.

4

AI for Fisheries Science: Neural Network Tools for Forecasting, Spatial Standardization, and Policy Optimization

Kapur, M.; Adams, G.; Lapeyrolerie, M.; Thorson, J. T.

2026-03-17 ecology 10.64898/2026.03.13.711664 medRxiv

Top 0.1%

14.5%

The development of Artificial Intelligence (AI) presents novel opportunities for tackling complex marine resource management challenges. Among AI models, neural networks are a powerful class of tools capable of learning nonlocal and lagged patterns from fisheries data as well as approximating nonlinear relationships among multiple latent variables using estimation methods that automatically implement statistical shrinkage. This gives them potential to effectively handle data obtained from fisheries populations subject to dynamic environments. We highlight two flexible subclasses and one application of neural networks: Long Short-Term Memory (LSTM) networks and Convolutional Neural Networks (CNNs), and policy discovery via Reinforcement Learning. LSTMs are designed to handle sequential data by allowing prediction from past values at both short and long time-lags. CNNs are not explicitly designed to handle temporal information, but can interpolate a spatial latent variable based on its value within a geographic neighborhood, and can be combined with LSTM models for this purpose. This "Food for Thought" paper introduces and applies these neural network approaches, both alone and in combination, to demonstrate their potential application for several foundational topics in fisheries science: 1) the forecasting of population weight-at-age, 2) the standardization of spatio-temporal indices of relative abundance, and 3) the discovery of harvest policies to optimize catches and maintain spawning biomass. Each section provides a simple, simulated example and describes the tradeoffs - particularly the lack of inferential capability - presented by using neural networks over traditional approaches for each topic. We then outline medium-term research questions that may clarify, facilitate or qualify the applicability of these tools to fisheries management science. Finally, we discuss how future combinations of these approaches could result in simplified ways to estimate and forecast stock biomass and provide harvest advice.

5

Overcoming software bottlenecks for scalable passive acoustic monitoring: insights from a global expert assessment

Malerba, M. E.; Perez-Granados, C.; Bell, K.; Palacios, M. M.; Bellisario, K. M.; Desjonqueres, C.; Marquez-Rodriguez, A.; Mendoza, I.; Meyer, C. F. J.; Ramesh, V.; Raick, X.; Rhinehart, T. A.; Wood, C. M.; Ziegenhorn, M. A.; Buscaino, G.; Campos-Cerqueira, M.; Duarte, M. H. L.; Gasc, A.; Hanf-Dressler, T.; Juanes, F.; do Nascimento, L. A.; Rountree, R. A.; Thomisch, K.; Toledo, L. F.; Toka, M.; Vieira, M.

2026-04-01 ecology 10.64898/2026.03.30.715176 medRxiv

Top 0.1%

12.8%

Passive acoustic monitoring (PAM) enables non-invasive sampling of wildlife across broad spatial, temporal and taxonomic scales. Its ongoing and widespread use has generated unprecedented volumes of acoustic data, shifting the primary bottleneck from data collection to the storage, processing, integration, and interpretation of PAM outputs. Although many software tools exist to address these challenges, differences in their design, scope, and usability often create fragmented and complex analytical workflows. To identify the key barriers and opportunities shaping the implementation of PAM surveys, we conducted a structured expert solicitation involving 30 international practitioners working across terrestrial and aquatic ecosystems. Experts identified and ranked their most critical pain points in current PAM workflows, spanning data storage, processing, and interpretation. The top challenge identified related to accurate species identification using deep learning and artificial intelligence (AI) models, especially in noisy soundscapes or for underrepresented taxa. Eight additional priority challenges included workflow fragmentation, limited availability of user-friendly analytical and visualisation tools, uneven access to software, manual validation bottlenecks, computational constraints, and difficulties in data handling, standardisation, and sharing. Participants also proposed practical mitigation strategies for these priority challenges, supported by step-by-step guidance to help overcome key barriers. Together, these insights provide a roadmap toward more scalable, open-access, and collaborative software systems, which are increasingly essential to realise the full potential of PAM in global biodiversity monitoring.

6

A comparison of BirdNET, expert listening and acoustic indices to monitor avian diversity in a Mediterranean agricultural landscape

Akoglu, I.; Bacak, E.; Bilgin, S.; Boyla, K. A.; Duran, M.; Akcay, C.; Ertor-Akyazi, P.

2026-05-21 ecology 10.64898/2026.05.20.726349 medRxiv

Top 0.1%

12.7%

Passive acoustic monitoring has immense potential for assessing avian diversity in many habitats, including agricultural landscapes. At the same time, automated recorders generate large datasets, which present a challenge for processing and effectively assessing biodiversity. Methods such as manual listening by experts, automated detection algorithms like BirdNET, and calculating acoustic indices all present different trade-offs in assessing biodiversity through passive acoustic monitoring. In the present study we recorded soundscapes in a low-intensity agricultural landscape in western Turkiye in all four seasons. Two expert ornithologists listened to a subset of these recordings and identified the bird species in them. We also ran the same sample of recordings through BirdNET to compare BirdNET detections with expert detections, and calculated acoustic indices for each recording. The results showed that BirdNET detected more species than the experts, although some of these detections may not be reliable. Two acoustic indices (bioacoustic index and acoustic complexity index) were positively correlated with the number of species detected by experts, and one (normalized difference soundscape index) with the number of species detected by BirdNET, but the correlations were modest. The results show that acoustic indices may have limited value in detecting biodiversity and that automated detection algorithms may do a better job, although these may need to be trained with local data to improve detection and classification.

7

FishWIO: a labeled image dataset of Western Indian Ocean reef fishes for training and testing classification algorithms

Fleure, V.; Villeger, S.; Claverie, T.

2026-03-04 ecology 10.64898/2026.03.03.709272 medRxiv

Top 0.1%

11.0%

Monitoring fish communities is essential for understanding biodiversity dynamics and coral reef ecosystem health. Underwater imaging provides a non-invasive and repeatable approach for such monitoring, yet analysis of large volumes of video data remains extremely time-consuming for experts. Resolving this bottleneck through automated fish identification is now within reach, but it requires large, high-quality labelled image datasets for training and testing reliable deep learning models. However, to date, no such dataset exists for the Western Indian Ocean (WIO), a global biodiversity hotspot hosting more than 300 common non-cryptobenthic fish species and facing increasing anthropogenic pressures. This paper presents a novel and publicly available dataset of 114,664 images annotated from 186 videos recorded using fixed underwater cameras on shallow reef habitats in the Mayotte archipelago. All images were labelled and validated by trained marine biologists following a standardized protocol. Each image includes detailed metadata describing recording conditions. The dataset comprises 124 reef fish species (including 110 with >200 images) and 8 background classes. This dataset will allow training and testing of automated fish classification models.

8

An explainable machine learning consensus framework for robust estimations of environmental effects on population dynamics

Dhananjanie, A.; Thompson, H.; Vercelloni, J.; Warne, D. J.

2026-05-13 ecology 10.64898/2026.05.10.724190 medRxiv

Top 0.1%

10.4%

Explainable machine learning (ML) methods are gaining increasing attention in environmental and ecological research for their ability to reveal relationships between environmental drivers and population dynamics. However, questions remain about the reliability of these tools, especially given that recent research shows these explanations can be highly sensitive to model architecture. In ecology, it is typical to use a single ML model, and a comparative evaluation of how sensitive explanations are to the choice of ML approach is often overlooked. In this paper, we develop a novel framework that quantifies explanation consistency between multiple ML model architectures. This framework provides a discrepancy measure for each model prediction, with high discrepancy indicating substantive explanation disagreement across models and low discrepancy indicating strong consensus in explanations across models. We then demonstrate that low explanation discrepancy aligns well with the ground-truth mechanism. Furthermore, high explanation discrepancy provides a mechanism to identify areas for model refinement and further investigation by domain experts. We do this using a simulation study based on synthetic coral cover data that incorporate spatio-temporal variability driven by known disturbance effects. Our method provides a quantitative approach to assess the sensitivity of explainable ML in the absence of ground truth. As a result, it enhances the utility of ML approaches in conservation and ecological management. While we focus primarily on ecological modelling for coral reefs, our methods are generally applicable to other ecological and environmental modelling settings.

9

BatSpot: a retrainable neural network for automatic detection and classification of bat echolocation and detection of buzzes and social calls

Smeele, S. Q.; Hauer, C.; Bergler, C.; Dechmann, D. K. N.; Dietzer, M. T.; Elmeros, M.; Fjederholt, E. T.; Fogato, A.; Kohles, J. E.; Noeth, E.; Brinkloev, S. M. M.

2026-03-13 ecology 10.64898/2026.03.11.711063 medRxiv

Top 0.1%

10.3%

Bats are a diverse taxonomic group that display a wide range of interesting behaviours. Many bats are keystone species for their ecosystem, are IUCN Red-listed as vulnerable to critically endangered, and are subject to human-wildlife conflicts arising from anthropogenic expansion. Yet bats remain understudied with respect to behaviour, population ecology and conservation status. One of the major challenges when studying bats is obtaining data. Their nocturnal lifestyle and use of ultrasonic echolocation make them difficult to track and record using traditional methods. Recent advances in passive acoustic monitoring have allowed researchers to record large amounts of data, but the detection and classification of vocalisations remain a challenge. Most available tools are either for profit or are limited to a narrow geographic range, and mostly focus on echolocation search phase calls. Here we present BatSpot, a convolutional neural network trained to detect search phase calls, buzzes and social calls. It also offers the option to classify the search phase calls to species(-complex) level. We provide a GUI that allows researchers to retrain or transfer-train the models for their specific needs and validate the performance. We test the performance of all models and show that they perform better than both commercial and open-source solutions (search phase file level F1: 0.97 vs 0.96, buzz detector F1: 0.95 vs 0.11). We furthermore show that retraining the search phase call detector for a new country with examples from just 59 recordings massively improves the performance (F1: 0.48 to 0.79). BatSpot will enable bat researchers globally to automate detection and classification with minimal effort and includes novel options for social call and buzz detection, typically not featured in other automated tools for bat monitoring.

10

Coral restoration alters reef soundscapes but machine learning and manual analyses suggest different recovery rates

Croasdale, E. M.; Saponari, L.; Dale, C.; Shah, N.; Williams, B.; Lamont, T. A. C.

2026-04-02 ecology 10.64898/2026.03.31.710564 medRxiv

Top 0.1%

10.3%

Coral restoration is recognised as a critical tool to mitigate pantropical degradation of reef ecosystems. Robust monitoring of restoration progress is crucial for projects to evaluate their success, improve practice, and share knowledge. However, traditional visual surveys often fail to capture the full impact of coral restoration on reef function. Therefore, we employed Passive Acoustic Monitoring (PAM) to assess whether the soundscape of a coral restoration site in the Seychelles differs from adjacent healthy and degraded reference reefs. We applied two methods of soundscape analysis: manual detection of unidentified fish sounds; and machine learning-based Uniform Manifold Approximation and Projection analysis. Results were approach-specific: the manual approach highlighted similarities in fish calls between the restoration site and the healthy reference reef, while the machine learning approach extracted broader soundscape patterns, clustering the restoration site alongside the degraded reference reef. Although this is a single-site study, these findings suggest that a) coral restoration alters reef soundscapes, though recovery time may be taxon-specific, and b) multiple metrics are needed to bridge single-taxon and broad soundscape scales. This study contributes to the evolving field of soundscape ecology in coral reef ecosystems, highlighting the utility of PAM in monitoring changes to reef function through coral restoration.

11

Passive acoustic monitoring as a tool for early detection of invasive animal species: a systematic review and a case study

Becker, D.; Kasten, M. K.; Weber, T.; Grass, I.; Hiller, T.

2026-04-28 ecology 10.64898/2026.04.24.720520 medRxiv

Top 0.1%

9.8%

Invasive animal species are spreading rapidly across the globe, creating an urgent need for efficient early-detection and monitoring tools. Passive acoustic monitoring has become an established method in biodiversity research, but its application to invasive species monitoring has been less systematically explored. Here, we combine a systematic literature review with a field-based case study to evaluate the potential of passive acoustic monitoring for invasive animal detection. We identified 26 studies on acoustic monitoring of invasive animals, mainly addressing amphibians (11 studies), birds and fish (five each) with most studies from the USA and Australia. The use of acoustic monitoring of invasive species has increased during the past decade, with recent studies applying automated detection, machine learning, and large-scale monitoring frameworks. As a case study, we further tested the feasibility of low-cost acoustic monitoring of the invasive American bullfrog (Lithobates catesbeianus) in southwestern Germany, combined with automated identification using BirdNET. We successfully confirmed bullfrog presence in eight of the eleven monitored lakes, including sites close to a protected nature reserve. Our results highlight the growing potential of passive acoustic monitoring of invasive species under field conditions. In combination with automated species detection, manual validation, and emerging real-time monitoring devices, passive acoustic monitoring becomes an increasingly powerful tool for early intervention and scalable management of biological invasions.

12

Reliability and Spatiotemporal Autocorrelation of Acoustic Indices: Implications for Biodiversity Monitoring

Jiang, X.; Zhang, Y.; Shu, Z.; Xiao, Z.; Wang, D.

2026-03-20 ecology 10.64898/2026.03.18.712292 medRxiv

Top 0.1%

8.5%

Passive acoustic monitoring (PAM) is increasingly applied in biodiversity research, yet its reliability as a proxy for biodiversity remains insufficiently evaluated. In particular, the spatiotemporal autocorrelation inherent in acoustic indices of PAM is rarely quantified, despite its importance for the standardized application of acoustic monitoring. We conducted an integrated study to investigate these issues using a complete grid-based monitoring system covering the entire region (100 grids of 1 km x 1 km) in southern subtropical climatic zones. Acoustic data from 58 valid sites were combined with camera-trapping and vegetation surveys to evaluate six commonly used acoustic indices in PAM. We found that these indices were more strongly associated with relative abundance and community diversity metrics of birds and mammals than with species richness. Spatially, the autocorrelation ranges of some acoustic indices, namely the Bioacoustic Index (BIO) and the Normalized Difference Soundscape Index (NDSI), extended to approximately 4 km. Temporally, all indices exhibited significant autocorrelation over 2-5 days, exceeding the typical short-term turnover of bird and mammal activity (1-2 days). Our results indicate that acoustic indices are not direct proxies for species richness but provide complementary information on soundscape dynamics. By explicitly quantifying spatiotemporal autocorrelation, this study offers practical guidance for sampling design and statistical analysis in passive acoustic monitoring, supporting more reliable and efficient biodiversity assessment.

13

Automated morphometry and weight prediction of juvenile Chinook Salmon leveraging open-source deep learning models

Knight, B.; Jeffres, C.

2026-03-12 ecology 10.64898/2026.03.10.710725 medRxiv

Top 0.1%

7.0%

Minimizing handling of threatened and endangered fish has become increasingly important as populations have dwindled. To minimize handling in morphometric measurements, the HandsFreeFishing program has been developed for juvenile Chinook Salmon (Oncorhynchus tshawytscha). By segmenting a 2D image, many morphometric measurements can be estimated; from these measurements a weight prediction model is built based on fish whose ground-truth weights were measured using a digital scale. While many segmentation methods may be used, here Meta's Segment Anything Model (SAM) is employed to produce segmentation masks of raw images. This model is open-source and easily used on any image (of any size) with good performance. In the proposed framework, the user supplies a bounding box around a target fish along with minimal orientation data (left or right facing, upside down or right-side up); the rest of the segmentation, feature extraction, and final weight prediction is completely automated. A main goal of the segmentation is to estimate the surface area of the side profile of the fish. Then, assuming an ellipsoidal shape, this surface area can be related to the volume of the fish, which is directly proportional to the weight. Even on a relatively small dataset of 149 images (fork length 27-90 mm), our results confirm the predictive qualities of the morphometric features measured. The model achieved weight prediction with a mean absolute error of 0.16 g, a mean absolute percentage error of 12%, and an r-squared value of 0.99, on fish ranging from 0.31 g to 7.74 g. The raw images come from a variety of fish viewers, the design of which is relatively inexpensive and reproducible, and, in conjunction with the HandsFreeFishing program, allows for minimal handling compared to traditional length and weight measurement methods.
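
The area-to-volume step can be sketched geometrically. For an ellipsoid with semi-axes a (half the length), b (half the depth) and c (half the width), the side-profile area is A = pi*a*b and the volume is V = (4/3)*pi*a*b*c. The sketch below is not the HandsFreeFishing model: the width-to-depth ratio, the density of 1 g/cm^3, and the function names are assumptions made only to show how a profile area can be turned into a rough weight.

```python
import math


def ellipsoid_volume_mm3(profile_area_mm2: float, fork_length_mm: float,
                         width_to_depth: float = 0.5) -> float:
    """Volume of an ellipsoid recovered from its side-profile area.

    a = half the length, b = half the depth (from A = pi*a*b),
    c = half the width, assumed to be a fixed fraction of b.
    """
    a = fork_length_mm / 2.0
    b = profile_area_mm2 / (math.pi * a)
    c = width_to_depth * b  # assumed body proportion, not from the paper
    return (4.0 / 3.0) * math.pi * a * b * c


def weight_g(volume_mm3: float, density_g_per_cm3: float = 1.0) -> float:
    """Weight proportional to volume; 1000 mm^3 equals 1 cm^3."""
    return density_g_per_cm3 * volume_mm3 / 1000.0


# Hypothetical fish: 60 mm long with a 12 mm deep elliptical profile.
area = math.pi * 30.0 * 6.0
volume = ellipsoid_volume_mm3(area, fork_length_mm=60.0)
print(round(volume, 1), round(weight_g(volume), 2))
```

In practice the proportionality constants would be fitted against scale-measured weights rather than fixed in advance, which is what a regression on the morphometric features does.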

14

An Open Reproducible Framework for CNN-Based Cetacean Vocalization Detection in Passive Acoustic Monitoring

De Marco, R.

2026-05-06 animal behavior and cognition 10.64898/2026.05.01.721665 medRxiv

Top 0.1%

7.0%

This paper presents a six-stage methodological framework for Convolutional Neural Network (CNN)-based cetacean vocalization detection and classification in Passive Acoustic Monitoring (PAM), implemented as the open-source toolkit ai-pam-pipeline. The framework is generalizable across species and fully parameterised through a single configuration file, guaranteeing exact experimental reproducibility. Two experiments are reported. Experiment A examines the effect of FFT window length Nfft ∈ {256, 512, 1024} on binary Bottlenose dolphin (Tursiops truncatus) whistle detection using stratified 10-fold cross-validation on an in-domain dataset (Oltremare, 192 kHz) and a cross-domain benchmark (DCLDE 2022). In-domain performance is uniformly high (macro F1 ≈ 0.98; Wilcoxon, all p > 0.05). Cross-domain results diverge substantially: Nfft = 256 is significantly superior (p = 0.006, rank-biserial r = 0.89). The mechanism is an upsampling amplification effect: coarser spectral bins produce wider, higher-contrast FM traces after bilinear resampling to fixed image dimensions. This superiority is threshold-invariant: precision equals 1.000 across all configurations and thresholds θ ∈ [0.1, 0.9], confirming that the advantage is not an artifact of threshold choice. These findings demonstrate that preprocessing choices -- often treated as secondary implementation details -- can significantly affect cross-domain generalisation. While Nfft serves here as a controlled case study, the framework is designed to enable systematic, reproducible evaluation of arbitrary preprocessing parameters within a unified experimental protocol. Experiment B demonstrates multiclass capability on five T. truncatus vocalization categories (macro F1 = 0.843); inter-class confusion between click trains and burst-pulse sounds reflects biological signal overlap rather than classifier failure.
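
As a rough guide to what "coarser spectral bins" means for the three window lengths, the standard relations are bin width = sample rate / Nfft and window duration = Nfft / sample rate. The sketch below applies them to the 192 kHz in-domain sampling rate mentioned above; the function name is illustrative, and hop size, overlap, and the later image resampling are not modelled.

```python
from typing import Dict


def stft_resolution(sample_rate_hz: float, n_fft: int) -> Dict[str, float]:
    """Frequency and time resolution implied by an FFT window length."""
    return {
        "bin_width_hz": sample_rate_hz / n_fft,        # spacing between bins
        "window_ms": 1000.0 * n_fft / sample_rate_hz,  # duration of one window
        "n_bins": n_fft // 2 + 1,                      # one-sided spectrum size
    }


SAMPLE_RATE_HZ = 192_000
table = {n_fft: stft_resolution(SAMPLE_RATE_HZ, n_fft) for n_fft in (256, 512, 1024)}

for n_fft, row in table.items():
    print(n_fft, row["bin_width_hz"], round(row["window_ms"], 3), row["n_bins"])
```

At 192 kHz this gives bins of 750 Hz, 375 Hz and 187.5 Hz for Nfft = 256, 512 and 1024, so the shortest window trades frequency detail for finer time resolution.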

15

BioDCASE: Using data challenges to make community advances in computational bioacoustics

Stowell, D.; Nolasco, I.; McEwen, B.; Vidana Vila, E.; Jean-Labadye, L.; Benhamadi, Y.; Lostanlen, V.; Dubus, G.; Hoffman, B.; Linhart, P.; Morandi, I.; Cazau, D.; White, E.; White, P.; Miller, B.; Nguyen Hong Duc, P.; Schall, E.; Parcerisas, C.; Gros-Martial, A.; Moummad, I.

2026-04-06 animal behavior and cognition 10.64898/2026.04.02.716062 medRxiv

Top 0.1%

6.3%

Computational bioacoustics has seen significant advances in recent decades. However, the rate of insights from automated analysis of bioacoustic audio lags behind our rate of collecting the data - due to key capacity constraints in data annotation and bioacoustic algorithm development. Gaps in analysis methodology persist: not because they are intractable, but because of resource limitations in the bioacoustics community. To bridge these gaps, we advocate the open science method of data challenges, structured as public contests. We conducted a bioacoustics data challenge named BioDCASE, within the format of an existing event (DCASE). In this work we report on the procedures needed to select and then conduct useful bioacoustics data challenges. We consider aspects of task design such as dataset curation, annotation, and evaluation metrics. We report the three tasks included in BioDCASE 2025 and the resulting progress made. Based on this we make recommendations for open community initiatives in computational bioacoustics.

16

Addressing Data Fragmentation in Biodiversity: A Workflow for integrated Species Distribution Models

Perrin, S. W.; Adjei, K. P.; Mostert, P.; Togunov, R. R.; Herfindal, I.; Topper, J. P.; Grytnes, J.-A.; Chipperfield, J.; O'Hara, R. B.; Finstad, A. G.

2026-05-21 ecology 10.64898/2026.05.19.721053 medRxiv

Top 0.1%

6.1%

Aim: A comprehensive understanding of the spatial distribution of biodiversity is hindered by fragmented datasets, sampling biases, and inconsistent observation protocols. Here, we present a workflow that integrates disparate datasets to produce large-scale maps of biodiversity metrics as a basis for management-relevant information tools. We use integrated species distribution modelling (iSDM) to account for sampling biases and disparate data collection techniques, taking advantage of the vast numbers of open datasets available in data aggregators like GBIF. Location: Norway (excluding Svalbard and Jan Mayen). Taxon: Vascular plants. Methods: The workflow consists of four main steps: data acquisition, data integration, integrated species distribution modelling (iSDM), and the production of derived outputs. Input data include structured surveys, opportunistic observations, and environmental covariates. These are standardised and integrated into a point-process-based iSDM framework to produce species richness maps, associated uncertainties, and sampling effort maps. The outputs are further processed to identify biodiversity hotspots or to summarise species-environment relationships. The workflow used vascular plant data from Norway, combining occurrence-only and presence-absence datasets with environmental covariates. Outputs were generated at a spatial resolution of 500 x 500 meters, balancing accuracy, computational feasibility and relevance for management decisions. High-performance computing resources were utilized for model fitting and predictions. A subset of available data was used to validate the species richness maps. Results: We produced detailed maps of species richness, uncertainties and sampling intensity across Norway's heterogeneous landscape, incorporating 1218 species in our final results. The species richness maps show patterns consistent with previous mapping efforts. Validation showed an increase in model accuracy when compared to models which did not use an iSDM framework. The workflow highlights limitations in the infrastructure of the currently openly accessible data, particularly the need for more structured presence-absence datasets and standardized metadata. Main conclusions: This study underscores the potential of workflows that integrate disparate datasets for biodiversity modelling. To maximize accuracy and utility, future efforts should focus on improving data standardization, the publication and collection of more structured data, and fostering data-sharing collaborations. Advances in the workflow itself, including optimising modelling covariates and integrating more comprehensive spatio-temporal aspects, will also increase the relevance of the outputs. These advances will increase our ability to estimate species richness with a precision and accuracy that can reliably inform conservation and management decisions.

17

Enhancing the Understanding of Environmental Microbiomes through Topic Modeling: A Quantitative and Qualitative Analysis

Kujat, A. S.; Hassenrück, C.; Lüdtke, S.; Labrenz, M.; Sperlea, T.

2026-05-01 ecology 10.64898/2026.04.28.721390 medRxiv

Top 0.1%

5.0%

Background: Understanding ecosystem dynamics is essential for assessing ecosystem health, yet remains challenging due to complex biotic and abiotic interactions. Microbial communities are valuable indicators of environmental change, but the high dimensionality of microbiome data requires advanced analytical methods. This study explores the use of topic modeling (TM), an unsupervised machine learning approach initially designed for text analysis, to analyze microbiome data from the dynamic Warnow Estuary on the southern Baltic Sea coast. Results: We applied TM to estuarine microbiome data and compared its performance to traditional dimensionality reduction methods, Principal Component Analysis (PCA) and Principal Coordinate Analysis (PCoA). Quantitative results indicate that TM performs comparably to conventional approaches in preserving ecological and functional information, and in certain aspects outperforms them. In addition, we show qualitatively that NNMF, a TM method, captures latent patterns in the data, providing an interpretable perspective on the microbiome. In this exploratory framework, NNMF suggested five distinct sub-communities within the estuary that appear to follow a seasonal succession influenced by freshwater inflow. These sub-communities were associated with specific ranges of salinity and temperature and showed distinct taxonomic profiles, with shared characteristics across the estuarine system. Conclusions: Our findings suggest that TM is a useful tool for exploring complex environmental microbiome datasets, offering a complementary perspective that can provide additional ecological insights. TM's ability to highlight coherent microbial community patterns indicates its promise for supporting environmental monitoring and informing targeted ecosystem management in dynamic habitats, though further studies are needed to fully assess its applicability.
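
To make the idea of a topic model on abundance data concrete, here is a minimal sketch of non-negative matrix factorization (NNMF) using the classic multiplicative update rules. It is not the authors' pipeline: the toy sample-by-taxon matrix `v`, the function names, the choice of two components, and the iteration count are all assumptions made for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Frobenius norm of the reconstruction residual V - WH."""
    wh = matmul(w, h)
    return sum(
        (v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0]))
    ) ** 0.5


def nmf(v, k, n_iter=200, seed=0, eps=1e-9):
    """Factor V (samples x taxa) into W (samples x k) and H (k x taxa), all >= 0."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(n_iter):
        # Update H: H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)] for a in range(k)]
        # Update W: W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)] for i in range(n)]
    return w, h


# Toy abundance table (invented): 5 samples x 4 taxa, two loose "sub-communities".
v = [
    [10, 8, 0, 0],
    [9, 9, 1, 0],
    [0, 1, 7, 9],
    [0, 0, 8, 10],
    [5, 4, 4, 5],
]

w, h = nmf(v, k=2)
# Each row of W weights the two components for one sample;
# each row of H is a component's taxonomic profile.
dominant = [max(range(2), key=lambda a: row[a]) for row in w]
print("dominant component per sample:", dominant)
print("reconstruction error:", round(frobenius_error(v, w, h), 3))
```

In this reading, the rows of `H` play the role of "topics" (taxonomic profiles of sub-communities) and the rows of `W` say how strongly each sample expresses them, which is what makes the factorization interpretable.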

18

The OS-Prey (Omnibus Study of Prey) database: A compilation of diet records for birds of prey.

Uiterwaal, S. F.; La Sorte, F. A.; Coblentz, K. E.; DeLong, J. P.

2026-03-31 ecology 10.64898/2026.03.28.714998 medRxiv

Top 0.1%

4.9%

Motivation: The diet composition of a predator is a direct reflection of its role in a food web, resulting from interactions with prey species. Raptors (including hawks, owls, and falcons) are ubiquitous predators with diverse diets, yet there is no comprehensive database of raptor diet composition. We present a database of over 3500 raw raptor diet records, compiled from more than 1000 studies and representing 173 raptor species from across the world. Our dataset complements existing qualitative summaries of species' diets by compiling thousands of quantitative diet "samples" over time and space to present diet data at a uniquely fine resolution. Main types of variable contained: The database comprises published records of raptor diets from pellets, prey remains, direct or photographic observations, prey DNA, and raptor gut or gullet contents. For each diet, we present the taxonomic identity and amounts of consumed prey. We additionally present various metadata for each diet such as location, habitat, and season. Spatial location and grain: The study incorporates diet records collected worldwide, with each record assigned geographic coordinates corresponding to the location where the diet information was obtained. Time period and grain: The database includes diet records from 1893 to 2025. We report a year for each diet record. Major taxa and level of measurement: We recorded raptor diet at the species level, including raptors from three orders: Strigiformes, Falconiformes, and Accipitriformes excluding vultures. Most prey are identified to species, but prey taxonomic level varies depending on the extent to which they could be identified. Software format: Diet records and metadata are provided in two files with comma-separated value (.csv) format.

19

Quantifying the vocal repertoire of adult common terns (Sterna hirundo)

Zogby, D. S.; Eddington, V. M.; Craig, E. C.; Kloepper, L. N.

2026-05-22 animal behavior and cognition 10.64898/2026.05.20.722623 medRxiv

Top 0.1%

4.8%

Common terns (Sterna hirundo) are regionally threatened migratory seabirds that form large breeding colonies during the North American summer months. They are highly vocal and serve as important bioindicators of aquatic ecosystems. Historically, acoustic studies on colonial seabirds have proven difficult due to the dense aggregations of individuals and high rate of call overlap. However, as passive acoustic monitoring (PAM) becomes increasingly common for studying seabird colonies, quantitative descriptions of species vocalizations are needed to accurately interpret behavioral information from colony soundscapes and support automated analysis of large acoustic datasets. This study aims to quantify the vocal repertoire of adult common terns. We deployed AudioMoths to collect acoustic data at a tern colony on Seavey Island, New Hampshire, USA, across the breeding season. Using RavenPro, unique call types were identified through visual and aural inspection of the acoustic data in the spectrogram. For each call, we then extracted measurements of peak frequency (Hz), bandwidth 90% (Hz), syllable duration 90% (s), and total bout duration (s) to quantify the characteristics of each call type. Statistical analyses for acoustic parameters by call type were performed using Kruskal-Wallis tests, followed by post-hoc Dunn tests. Our results demonstrate that each call type differs significantly from the others in at least one parameter, with the exception of the kek and kip/tjuk calls. These findings present the first quantitative analysis of common tern vocalizations for North America. By defining temporal and spectral characteristics for multiple call types, this work helps translate colony soundscapes into biologically meaningful information about tern behavior and colony dynamics. These descriptions also provide key parameters for developing automated tools to detect and classify vocalizations in dense, noisy colonies. Integrating quantified vocal characteristics with PAM offers a promising approach for monitoring colony activity and behavior while minimizing disturbance relative to traditional methods.
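
As an illustration of the Kruskal-Wallis test named in the abstract, the sketch below computes the tie-corrected H statistic from scratch for one acoustic parameter grouped by call type. The peak-frequency values and the three group labels are invented for the example and are not measurements from the study; the p-value step (a chi-square tail with k - 1 degrees of freedom) and the post-hoc Dunn comparisons are left out.

```python
def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of samples."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)

    # Sum of (rank sum squared / group size) over groups.
    acc, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        acc += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * acc - 3 * (n + 1)

    # Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N).
    counts = {}
    for x in pooled:
        counts[x] = counts.get(x, 0) + 1
    correction = 1 - sum(c ** 3 - c for c in counts.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical; H is undefined")
    return h / correction


# Invented peak frequencies (Hz) for three hypothetical call types.
peak_freq = {
    "call_a": [3100, 3250, 3300, 3180],
    "call_b": [4200, 4350, 4100, 4280],
    "call_c": [3150, 3900, 3600, 4150],
}
h_stat = kruskal_wallis_h(list(peak_freq.values()))
print("H =", round(h_stat, 3), "with", len(peak_freq) - 1, "degrees of freedom")
```

Because the test works on ranks rather than raw values, it suits acoustic measurements whose distributions are skewed or differ in spread between call types. In practice a library routine such as `scipy.stats.kruskal` would usually replace this hand-rolled version.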

20

Automated bird flight pattern extraction and classification using machine learning

Ostojic, M.; Sethi, S.

2026-03-19 ecology 10.64898/2026.03.17.712367 medRxiv

Top 0.1%

4.8%

With bird populations across the world being impacted by ever-growing anthropogenic pressures, reliable monitoring is essential to help halt or reverse declines. Existing visual bird monitoring approaches, which employ cameras or radars to deliver automated and large-scale monitoring data, face a variety of issues. Image-based species classification is only possible if the fine-scale features of a bird are clear, which can be difficult to achieve in real monitoring contexts without expensive, high-resolution cameras due to occlusion and lighting. Radar and video-based approaches which analyse longer-term flight behaviour over the course of seconds can achieve more reliable results in real monitoring contexts, particularly from greater distances, but still require expensive equipment and do not account for all the possible types of flight patterns. Here we present a novel approach to track a wide range of bird flight patterns using inexpensive equipment. As a proof-of-concept, we demonstrate how our approach can be used to classify birds among four species, Red Kite, Kestrel, Black-Headed Gull and Sparrowhawk, which represent four different types of flight patterns. The balanced accuracy of the classification is 0.5583, with per-species recall ranging from 0.2640 to 0.7750 and per-species precision ranging from 0.4583 to 0.5962. Our proof-of-concept study demonstrates how new and existing visual bird monitoring systems can leverage flight patterns to deliver species-level insights at lower costs and on larger scales than before.
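
For readers unfamiliar with the metric reported above, balanced accuracy is the unweighted mean of the per-class recalls, so a rare species counts as much as a common one. The sketch below shows the arithmetic on a small set of invented labels; the species names are reused only as labels, and none of the numbers come from the study.

```python
from collections import defaultdict


def per_class_recall(y_true, y_pred):
    """Return {label: recall} for every label that occurs in y_true."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    return {label: correct[label] / total[label] for label in total}


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


# Invented example: 4 kites, 4 kestrels, 2 gulls.
y_true = ["kite"] * 4 + ["kestrel"] * 4 + ["gull"] * 2
y_pred = [
    "kite", "kite", "kite", "gull",           # kite recall 3/4
    "kestrel", "kestrel", "kite", "kite",     # kestrel recall 2/4
    "gull", "kite",                           # gull recall 1/2
]

plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print("per-class recall:", per_class_recall(y_true, y_pred))
print("plain accuracy:", plain)                                  # 6 of 10 correct
print("balanced accuracy:", balanced_accuracy(y_true, y_pred))   # (0.75 + 0.5 + 0.5) / 3
```

With four classes, a classifier that guesses uniformly at random would be expected to score about 0.25 on this metric, which gives a baseline for reading the reported value.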